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Abstract. In this paper we develop a discrete Hierarchical Basis (HB) to efficiently solve the 
Radial Basis Function (RBF) interpolation problem with variable polynomial order. The HB forms 
an orthogonal set and is adapted to the kernel seed function and the placement of the interpolation 
nodes. Moreover, this basis is orthogonal to a set of polynomials up to a given order defined on the 
interpolating nodes. We are thus able to decouple the RBF interpolation problem for any order of the 
polynomial interpolation and solve it in two steps: (1) The polynomial orthogonal RBF interpolation 
problem is efficiently solved in the transformed HB basis with a GMRES iteration and a diagonal, 
or block SSOR preconditioner. (2) The residual is then projected onto an orthonormal polynomial 
basis. We apply our approach on several test cases to study its effectiveness, including an application 
to the Best Linear Unbiased Estimator regression problem. 
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1. Introduction. The computational cost for extracting RBF representations 
can be prohibitively expensive for even a moderate amount of interpolation nodes. For 
an TV-point interpolation problem using direct methods it requires 0{N'^) memory and 
0{N^) computational cost. Moreover, since many of the most accurate RBFs have 
globally supported and increasing kernels, this problem is often badly conditioned 
and difficult to solve with iterative methods. In this paper we develop a fast, stable 
and memory efficient algorithm to solve the RBF interpolation problem based on the 
construction of a discrete HB. 

Development of RBF interpolation algorithms has been widely studied in scien- 
tific computing. In general, current fast solvers are not yet optimal. One crucial 
observation of the RBF interpolation problem is that it can be posed as a discrete 
form of an integral equation. This insight allows us to extend the techniques origi- 
nally introduced for integral equations to the efficient solution of RBF interpolation 
problems. 

RBF interpolation has been studied for several decades. In 1977 Duchon [23] 
introduced one of the most well known RBFs, the thin-plate spline. This RBF is 
popular in practice since it leads to minimal energy interpolant between the interpo- 
lation nodes in 2D. In [21] Franke studied the approximation capabilities of a large 
class of RBFs and concluded that the biharmonic spline and the multiquadric give the 
best approximation. Furthermore, error estimates for RBF interpolation have been 
developed by Schaback et al. [SHUSOllSl] and more recently by Narcowich et al. [38] , 

RBFs are of much interest in the area of visualization and animation. They 
have found applications to point cloud reconstructions, denoising and repairing of 
meshes [M]. In general, they have been used for the reconstruction of 3-D objects 
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and deformation of these objects [501 ■ Although we notice that for these areas of 
apphcations it is usuahy sufficient to consider zero- and hnear-order polynomials in 
the RBF problems. Popular applications, such as Neural Networks and classification 
[46] , boundary and finite element methods [lU [22] , require consideration of higher- 
order polynomials. 

More recently, the connection between RBF interpolation. Generalized Least Squares 
(GLSQ) [49] and its extension to the Best Unbiased Linear Estimator (BLUE) prob- 
lem has been established [39l [33l [32] . If the covariance matrix of a GLSQ (and BLUE) 
problem is described by a symmetric kernel matrix of an RBF problem among other 
conditions, the two problems become equivalent. Although GLSQ is of high interest 
to the statistics community, as shown by the high number of citations of [IS] , the lack 
of fast solvers limits its application to small to medium size problems f331 [32] ■ More- 
over, many of these statistical problems involve higher than zero- and linear-order 
polynomial regression [571 (iSl HZl [33 [33 [M] • By exploiting the connection between 
GLSQ and RBFs, we will be able to solve GLSQ using the fast solvers developed in 
the RBF and integral equation communities. 

For the BLUE Kriging estimator there is less need of higher order polynomials. 
In many cases quadratic is sufficient for high accuracy estimation. For the matlab 
DACE Kriging toolbox [331 [31] the highest polynomial accuracy is set to quadratic. 
In our paper we demonstrate the difference between constant, linear and quadratic 
BLUE estimates. As shown in the results in Section[4l the quadratic interpolant leads 
to much better estimate than constant and linear ones. In addition, in [30j the author 
uses second order polynomial BLUE for repairing surfaces. 

Recently Gumerov et al. [27] developed a RBF solver with a Krylov subspace 
method in conjunction with a preconditioner constructed from Cardinal functions. 
We note that this approach, to our knowledge, is the state of the art for zero-order 
interpolation in with a biharmonic spline. This makes it very useful for interpola- 
tion problems in computer graphics. On the other hand, its application to regression 
problems such as GLSQ is rather limited. 

A domain decomposition method was developed in [8] by Beatson et al. This 
method is a modification of the Von Neumann's alternating algorithm, where the 
global solution is approximated by iterating a series of local RBF interpolation prob- 
lems. This method is promising and has led to (coupled with multi-pole expansions) 
0(A^log(A^)) computational cost for certain interpolation problems. 

Although the method is very efficient and exhibits 0(iVlog(A^)) computational 
complexity, this seems to be true for small to medium size problems (up to 50,000 
nodes in R'^) with smooth data. Beyond that range the computational cost increases 
quadratically as shown in [8]. Other results for non smooth data shows that the 
computational complexity is more erratic [1 5j . Furthermore, in many cases, it is not 
obvious how to pick the optimal domain decomposition scheme. 

An alternative approach was developed by Beatson et al. [7], which is based on 
preconditioning and coupled with GMRES iterations [48^. This approach relies on 
the construction of a polynomial orthogonal basis, similar to the HB approach in 
our paper. This approach gives rise to a highly sparse representation of the RBF 
interpolation matrix that can be very easily preconditioned by means of a diagonal 
matrix. The new system of equations exhibits condition number growth of no more 
than 0{\ogN). The downside is that this basis is not complete. This is ameliorated 
by the introduction of non decaying elements, but no guarantees on accuracy can be 
made. 
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Our approach is based on posing the RBF interpolation problem as a discretiza- 
tion of an integral equation and applying preconditioning techniques. This approach 
has many parallels with the work developed by Beatson et al. [7]. However, our 
approach was developed from work done for fast integral equation solvers. 

Most of the work in the area of fast integral equation solvers has been restricted 
to the efficient computation of matrix vector products as part of an iterative scheme. 
For the Poisson kernel the much celebrated multi-pole spherical harmonic expansions 
leads to a fast summation algorithm that reduces each matrix-vector multiplication 
to 0{N) computational steps [551 El- This technique has been extended to a class 
of polyharmonic splines and multiquadrics [5l[2^. More recently L. Ying et al. has 
developed multipole algorithms for a general class of kernels [5^. In contrast, the 
development of optimal (or good) preconditioners for integral equations has been 
more limited. 

A unified approach for solving integral equations efficiently was introduced in 
[3 [21 [5]. A wavelet basis was used for sparsifying the discretized operator and 
only 0{N log2{N)) entries of the discretization matrix are needed to achieve opti- 
mal asymptotic convergence. The downside is that it was limited to ID problems. 

In [IB] a class of multiwavelets based on a generalization of Hierarchical Basis 
(HB) functions was introduced for sparsifying integral equations on conformal surface 
meshes in M"^. These wavelets are continuous, multi-dimensional, multi- resolution and 
spatially adaptive. These constructions are based on the work on Lifting by Schroder 
and Sweldens [52j and lead to a class of adapted HB of arbitrary polynomial order. 
A similar approach was also developed in [55] . 

These constructions provide compression capabilities that are independent of the 
geometry and require only 0(iV log4'^(iV)) entries to achieve optimal asymptotic con- 
vergence. This is also true for complex geometrical features with sharp edges. More- 
over, this basis has a multi-resolution structure that is related to the BPX scheme 
[42], making them an excellent basis to precondition integral and partial differential 
equations. In [28] Heedene et al. demonstrate how to use this basis to build scale 
decoupled stiffness matrices for partial differential equations (PDEs) over non uniform 
irregular conformal meshes. 

In this paper, we develop a discrete HB for solving isotropic RBF interpolation 
problems efficiently. Our HB construction is adapted to the topology of the interpo- 
lating nodes and the kernel. This new basis decouples the polynomial interpolation 
from the RBF part, leading to a system of equations that are easier to solve. With 
our sparse SSOR [25l|3T] or diagonal preconditioner, combined with a fast summation 
method, the RBF interpolation problem can be solved efficiently. 

Our contributions include a method with complexity costs similar to Gumerov 
et al [27] for problems in R"^, albeit for the biharmonic spline with constant order 
their approach shows efficiency superior to ours. However, due to the decoupling of 
the polynomial interpolation, our approach is more flexible and works well for higher 
order polynomials. We show similar results for the multiquadric RBFs in R'^ (we did 
not observe multiquadric results for R'^ in TT' and to our knowledge we have not 
found any) . Note that the idea of decoupling the RBF system of equations from the 
polynomial interpolation has also been proposed in |53| and [S]. 

In the rest of Section [T] we explicitly pose the RBF interpolation problem. In 
Section [21 we construct an HB that is adapted to the interpolating nodes and the 
kernel seed function. In Section [H] we demonstrate how the adapted HB is used to 
form a multi-resolution RBF matrix, which is used to solve the interpolation problem 
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efBciently. In section 21 we show some numerical results of our method. The inter- 
polating nodes are randomly placed, moreover the interpolating values themselves 
contain random noise. We summarize our conclusions in section [51 

After we submitted this paper we became aware of the H-Matrix approach by 
Hackbusch [11] applied to stochastic capacitance extraction [61] problem. In [lO] 
the authors apply an H-matrix approach to sparsify the kernel matrix arising from a 
Gaussian process regression problem to 0{N log N). In our paper, we apply HB to 
precondition the RBF system, although we could also use them to sparsify it. Instead, 
we use a fast summation approach to compute the matrix-vector products. 

There are many similarities between our approach and the H-matrix method. We 
shall discuss this topic more in detail in the conclusion in Section |5l 

1.1. Radial Basis Function Interpolation. In this section we pose the prob- 
lem of RBF interpolation for bounded functions defined on M^. Although our expo- 
sition is only for M"^, the RBF problem and our HB approach can be easily extended 
to any finite dimensions. 

Consider a function f{x) : M"^ — !> M and its evaluation on a set of user-specified 
sampling of distinct nodes X {xi, xn} C M'^, where x = [xi, X2, 2:3]^, unisolvent 
with respect to all polynomials of degree at most m. We are interested in constructing 
approximations to f{x) of the form 

M{m) N 

where : x r3 ^ ^ e rA^^ c e R^'^M and P := {qi{x), . . . ,qM{^){x)} is 
a basis for 7'"'(R^), i.e. the set of all polynomials of total degree at most m in R'^ 
(Note that M{m) is the number of polynomials that form a basis for 7'™(R^) i.e. 
M{m) = (™^'^) )• This interpolant must satisfy the following condition 

^ f{x.j), j = l,...,N, 

for all Xj in X. Moreover, to ensure the interpolation is unique we add the following 
constraint 



N 

(1.1) ^w[j-]g(f,)=0, 

i=i 

for all polynomials q{x) of degree at most m. Now, since M{m) is the minimum 
amount of nodes needed to solve the polynomial problem, we need at least N ^ M{m) 
RBF centers. The interpolation problem can be rewritten in matrix format as 



(1.2) 



K Q \ ( u \ _ ( d 

g« o M c ] - I 



where Ki^j — K{xi,Xj) with i,j running from 1 to n; d £ R^ such that dj — f{xj); 
c G M^^C™); and Qi.j = Qjixi) with i running from 1 to N, j runs from 1 to M{m). 
Denote the columns of Q as [qi, . . . , qM{7n}]- 
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This is the general form of the RBF interpolation isotropic problem. The proper- 
ties of this approximation mostly depend on the seed function K{x,y). An example 
of a well known isotropic kernel in is the biharmonic spline 

(1.3) K{x,Xj) :=X(|f-fj|) = \x~Xj\. 

This is a popular kernel due to the optimal smoothness of the interpolant [S]. This 
kernel has been successfully applied in point cloud reconstructions, denoising and 
repairing of meshes [14) . More recently, there has been interest in extensions to 
anisotropic kernels [16l [17] , i.e. 

K{x,x,) := K {\T,{x - xj)\) , 

where is a 3 x 3 matrix. The stabilization method introduced in this paper can be 
extended to solving efficiently the RBF problem with spatially varying kernels. By 
using the sparsification properties of the adapted HB a sparse representation of the 
spatially varying RBF matrix can be constructed in optimal time. However, in this 
paper we restrict the analysis to isotropic kernels in M.^, i.e. Tj = al where a > 0. 

One aspect of RBF interpolation is the invertibility of the matrix in Equation 
([1.2p . In [37] it is shown that the interpolation problem (|1.2[) has a unique solution if 
we assume that the interpolating nodes in X are unisolvent with respect to 
and the continuous kernel is strictly conditionally positive (or negative) definite. Be- 
fore we give the definition, we provide some notation. 

Definition 1.1. Suppose that X C M.^ is a set of interpolating nodes and 
{qi{x),q2{x), . . . ,qM(m){^)} o, basis for 7""(R'^), then we use V"^{X) to denote 
the column space of Q . 

We now assume the kernel matrix K satisfies the following assumption. 

Definition 1.2. We say that the symmetric function if (•, •) : x — > M 
is strictly conditionally positive definite of order I if for all sets X CZ M.^ of distinct 
nodes 

N 

Kv = ViVjK{xi, Xj) > 0, 

for all V e M.^ such that v 1. V^{X) and v ^ Q. Alternatively, under the same 
assumptions, K{-, •) : x — s> M is strictly conditionally negative definite if 

v"Kv < 0, 

for all w e such that w _L 'P'(X). 

Remark 1 . Many Kernels, including hiharmonics and multiquadrics, satisfy the 
strictly conditionally positive or negative assumption. By a theorem due to Micchelli 
\87l , this assumption is satisfied for all continuous kernels with completely monotonic 
derivative on (0,oo). 

The invertibility of the RBF interpolation problem can be proven by the basis 
construction developed in this paper. Although this is not necessary, it does cast 
insights on how to construct a basis that can solve the RBF Problem ([1.2[) efficiently. 
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1.2. Decoupling of the RBF interpolation problem. Suppose there exists 
a matrix T : M^-*^ M^, where M := dim{'P"'{X)), such that T" annihilates 
any vector v G (i.e. T"v = Vw G V™'{X)). Furthermore, suppose there 

exists a second matrix L : R*^ — )• such that the combined matrix P := [L T] is 
orthonormal such that : maps onto 

V^iX) © W, 

where := iV"'{X))^. 

Suppose that u e V"'{X)-^, then u = Tw for some w; £ R^-*^. Problem pT^ 
can now be re-written as 

T"KTw + T^Qc = T^d. 
However, since the columns of Q belong in 7-""(X) then 



(1.4) T"KTw^T"d. 

From Definition 11.21 and the orthonormality of P we conclude that w can be solved 
uniquely. The second step is to solve the equation L^Qc = d — KTw. From the 
unisolvent property of the nodes X the matrix Q has rank dim{V"^ {M.^^)) , moreover, 
L also has rank dim{P'^{M.^)), thus L^Q has full rank and it is invertible. 

Although proving the existence of P and hence the uniqueness of the RBF problem 
is an interesting exercise, there are more practical implications to the construction of 
P. First, the coupling of Q and K can lead to a system of ill-conditioned equations 
depending on the scale of the domain [8] . The decoupling property of the transform 
P leads to a scale independent problem, thus correcting this source of ill-conditioning. 
But more importantly, we focus on the structure of KT and how to exploit it to 
solve the RBF interpolation problem (|1.2p efficiently. The key idea is the ability of 

to vanish discrete polynomial moments and its effect on the matrix K {■,■). We 
shall now restrict our attention to Kernels that satisfy the following assumption. 

Assumption 1. Let D'i := i^^^,' „ldt^ and similarly for D^,, we assume that 



where a = (ai, a2, as) € Z"^ , |a| = ai + a2 + 0.3, and q E Z. In addition, we assume 
that K{x, •) and K{-,y) are analytic everywhere except for x = y. This assumption 
is satisfied by many practical kernels, such as multiquadrics and poly harmonic splines 

[21 [g. 

2. Adapted Discrete Hierarchical Basis Constructions. In this section we 
show how to construct a class of discrete HB that is adapted to the kernel function 
K(-, •) and to the local interpolating nodes (or interpolating nodes) contained in X. 
The objective is to solve RBF interpolation Problem (|1.2|) efficiently. The HB method 
will be divided into the following parts: 

• Multi-resolution domain decomposition. The first part is in essence a 
pre-processing step to build cubes at different levels of resolution as place 
holders for the interpolation nodes belonging to X. 
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Adapted discrete HB construction. From the multi-resolution domain 

decomposition of the interpolating nodes in X , an adapted multi-resohition 
basis is constructed that annihilates any polynomial in PP{X), where p G Z+ 
and m. p will be in essense the order of the Hierarchical Basis, which is 
not to be confused with m, which is the order of the RBF problem. 
GMRES iterations with fast summation method. With the adapted 
HB a multi-resolution RBF interpolation matrix is implicitly obtained through 
a fast summation method and solved iteratively with a GMRES algorithm 
and an SSOR or diagonal preconditioner. 



2.1. Multi-resolution Domain Decomposition. The first step in construct- 
ing an adapted HB is to rescale and embed the interpolating nodes in X into a cube 
Bq := [—0.5,0.5]^. The next step is to form a series of level dependent cubes that 
serve as place holders for the interpolating nodes at each level of resolution. 

The basic algorithm is to subdivide the cube Bq into eight cubes if \Bq\ > M{p), 
where \Bl\ denotes the total number of interpolating nodes contained in the cube 
Bj. Subsequently, each cube is sub-divided if > M{p) until there are at 
most M{p) interpolating nodes at the finest level. The algorithm is explained more 
in detail in the following pseudo-code: 

Input: X := {xi,X2, . ■ . ,xn }, M{p) 
Output: Bi Vfc e {X;(0), . . . , }C{n)}, n 
begin 

pre-processing; 

q ^ max5,,5^expj - 
forall the xi ^ X do 



N 



C 

end 

for i 1 to # do 



-0.5,0.5]3; 

{0}; 



I Xi 

end 

}C{0) ^ 
main; 

while \bI\ > M{p) for any k € /C(j) do 

1C{3 + 1)^0; 
for /c ^ to |/C(j)| do 
if \Bl\> M{p) then 

subdivide into eight cubes {B^^^ 



m + 1) 



8k ' ■ 

/C(i + 1) U^^o 8fc + w; 



}; 



end 
end 

J ^ J + 1 ; 

end 
end 

Algorithm 1: Multi- resolution Domain Decomposition 
Remark 2. fC{j) is an index set for all the cubes at level j. We use |/C(j)| to 
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denote the cardinality ofJC{j). 

Remark 3. Finding the maximal distance between any two nodes can be per- 
formed in 0{N{n-\-l)) computational steps by applying an octree algorithm. Therefore 
the Multi-resolution Domain Decomposition algorithm can be performed in 0{N{n + 
1)) computational steps. This can be easily seen since the maximum number of boxes 
at any level j is bounded by N and there is a total o/ n + 1 levels. 

Before describing the construction of the adapted discrete HB, we introduce some 
more notations to facihtate our discussion. 

Definition 2.1. Let Bj be the set of all the cubes Bj^ at level j that contain at 
least one interpolating center from X . 

Definition 2.2. Let C := {ci, . . . , bat}, where eji] — 1 and ejj] — Q if i ^ j . 
Furthermore, define the bijective mapping Fp : C X such that Fp{ei) = Xi, for 
i = 1 . . .N and : C — ?• Z+ s.t. Fq{ei) — i. Now, for each cube g Bn at the 
finest level n, let 

:= {e, I Fp(e.) G B^}. 
Definition 2.3. LetCn :=UfceK(«)Bfc- 

Definition 2.4. Let children{Bj,) be the set of nonempty subdivided cubes 
{Bt\...,Bj^^}eB, ofBl 

Definition 2.5. For every non empty B^ let the set parent{Bl) := {Bj ^ G 
B.j-1 I Bl e children{Bf-'^)}. 

2.2. Basis Construction. From the output of the multi-resolution decompo- 
sition Algorithm [T] we can now build an adapted discrete HB that annihilates any 
polynomial in V^(X). To construct such a basis, we apply the stable completion |13j 
procedure. This approach was followed in 18:. However, the basis is further orthogo- 
nalized by using a modified Singular Value Decomposition (SVD) orthonormalization 
approach introduced in [55] . 

Suppose vi, . . . ,Vs are a set of orthonormal vectors in M^, where s G a new 
basis is constructed such that 

s s 

■^^^iJ'^i^ j = i'j -^^dijUi, j = a+l,...,s, 

i=l 1=1 

where Ci,j, di,j G M and for some a G Z+. We desire that the new discrete HB vector 
i/jj to be orthogonal to V^{X), i.e. 

N 

(2.1) ^r[fc]^,[fc]=0, 

fc=i 

for all r G V^{X). Notice that the summation and the vectors r and ipj are in the 
same order as the entries of the set X. 

Due to the orthonormality of the basis {fijf^i this implies that Equation (|2.1[) is 
satisfied if the vector . . . , di,s\ belongs to the null space of the matrix 

M,,p := Q"V, 

where the columns of Q are a basis for V^(X) (i.e. all the polynomial moments) and 
V = [vi,V2, . . . ,Vs]. (Notice that the order of the summation is done with respect to 
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the set X). Suppose that the matrix M^ p is a rank a matrix and let Us^pDg pVg^p be 
the SVD decomposition. We then pick 



(2.2) 



Co,i 

C0,2 



Ca,l 
Ca,2 



da+lA 
da+l,s 



ds.l 
ds,2 



H 



where the colmiins a + 1, . . . , s form an orthonormal basis of the nuUspace J\f{Ms,p). 
Similarly, the columns 1, . . . , a form an orthonormal basis of R^\Af{Ms^p). 

Remark 4. // {vi, . . . , Vs} is orthonormal, then new basis {0i, . . . , (/)a, tpa+i, • ■ • , 
V's} is orthonormal, and spans the same space as span{vi, . . . ,Vs}. This is due to the 
orthonormality of the matrix Vg.p- 

Remark 5 . If s is larger than the total number of vanishing moments, then M, 



is guaranteed to have a nullspace of at least rank s — M{p), i.e. 
s — M{p) orthonormal vectors {ipi} that satisfy Equation \2.1i 



there exist at least 



2.3. Finest Level. We can now build an orthonormal multi-resolution basis. 
First, choose a priori the order of moments p and start at the highest level resolution 
level n. The next step is to progressively build the adapted HB as the levels are 
traversed. 

As 



At the finest level n, for each cube € Cn let Vi := Ci for all e; 
described in the previous section, the objective is to build new functions 



i=l 



^k,l ■= X! '^n,t,l,kVi, I = an,k + 1, . . . , S, 



i=l 



such that Equation (|2.ip is satisfied. 

The first step is to form the matrix M"jJ^ := Q^V, where the columns of Q are 
a basis for V^iX). Notice that since ei[w] = for w ^ i and G BjJ, then only \B'^\ 
columns of are need to form the matrix M^p and the rest can be thrown away 
since they multiply with zero. 



The next step is to apply the SVD procedure such that i\f "jf 



The coefficients c 

cin.k ■= rankM, 



'n.ij,k and C^n,z,j,fc ^^^^^^ .^.^^^^^^^^ ^^^^^^ ^^^^ ^^..^ ^ s,p 

k := vankMs^p. 

Now, for each e denote CJ^ as the collection of basis vectors {0^ ^ 
'^fea„fc}' ^^"^ similarly denote Z)^ as the collection of basis vectors {V'fco^fc+i 
tpk s}- Furthermore, we define the detail subspace 



Jjn,k r^n,kTrn.k 
s.p s,p s,p 

are then obtained from the rows of V"«*' and 



and the average subspace 

F,":=span{0ti,-.-,0fc,a„,J- 
By collecting the transformed vectors from all the cubes in C„, we form the subspaces 
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where © is a direct sum and IC{i) :— {k \ ^ B^}. 

Remark 6. We first observe that = © W"". This is true since the number 
of interpolating nodes is equal to N and = span{ei, 62, ... , e^v}. 

Remark 7. It is possible that WJ} = for some particular cube B^. This will 
be the case if the cardinality ofB^ is less or equal to M{p) i.e. the dimension of the 
nullspace of Mg^p is zero. However, this will not be a problem. As we shall see in 
section \2.4[ the next set of HB are built from the vectors in CJ} and its siblings. 

Lemma 2.6. The basis vectors ofV" and form an orthonormal set. 

Proof First notice that since n = whenever k ^ I then _L V^'\ 

T^n,k j_ p^nj j^j^j Yn,k j_ y^n,i _ r^^L^ ^gg^j^ fohows from the fact that the rows K";*^ 

form an orthonormal set. □ 

It is clear that the detail subspace W"" ± V'^{x), but the average subspace 
is not. However, we can stiU perform the SVD procedure to further decompose 
y . To this end we need to accumulate the average basis vectors of V^. For each 
B^-^ e Bn-i identify the set childreniB^-^). Form the set B;!-^ := {Cf | B," e 
children{B]^^^)} . We can now apply the SVD procedure on each set of average vectors 
InBr^. 

2.4. Intermediate Level. Suppose we have the collection of sets B^. for all 
k e IC{i). For each B^. perform the matrix decomposition Ml'p — Ul'pDl'^Vg;p for all 
w e B^. From the matrix V^'^ obtain the decomposition 

'^h ■=^(^i,]A,kVj, / = 1, . . . ,aj,fc, tpl^i ■.^'^dij^i^kVj, / = ai,A; + 1, . . . ,s, 

where the coefficients Cij^i^k and dij^i^k are obtained from the rows of V^p in (|2.2I) 
and Oi^k '■= rankMJ'p. Then we form the subspaces 

Wl := span{V'fc,a..,+i,---,V'fc,J, V^fc := span{(j)l^, . . . , (jj^^^J, 

and 

It is easy to see that 1/*+^ — T^* © VF*. The basis vectors are collected into two groups: 
Definition 2.7. For each B\ G B"^ let the sets CI := {^^ 1, . . . , ^l- a }: ^^^^ 

Just as for the finest level case, we can further decompose V^. To this end, for 
each Bir^ e B^-i identify the set children{B]r^). Form the set B^^^ := {C] \ B\ G 
children{B'^^)] . 

2.5. Coarsest Level. It is clear that when the iteration reaches the basis 
function no longer annihilates polynomial moments of order p. However, a new basis 
can be obtained that can vanish polynomial moments m ^ p. 

Recall that for the RBF interpolation problem with polynomial order m it is 
imposed that u ± V"^{X). li p = m then it is clear that u G © . . . VF" 
and RBF problem decouples as shown in Section [TJ However, if p > m then u G 
'PP{X)\V'^{X) © © . . . W^" and the RBF problem does not decouple. It is then 
of interest to find an orthonormal basis to 



VP{X)\V"\X). 
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This can be easily achieved. Let the cohimns of the matrix 









92(^1) 








qp[xi) 










g2(-'?7v) 


■) ' ' ' 1 


_ qfa{xN) 


7 ■ • • ; 







be a basis for PP{X), where each function qi{x) corresponds to the moment. Now, 
the first M{m) columns correspond to a basis for Thus an orthonormal basis 

for VP{X)\P"^{X) is easily achieved by applying the Gram-Schmidt process. 

Alternatively the matrix Mo,m can now be formed by applying the SVD decom- 
position and a basis that annihilates all polynomial of order m or lower is obtained. 
The matrix Cg can now be replaced with the matrix [Cq^ , Dq^], where the columns 
of Cq-^ fo rm an orthonormal basis for 'P"^{X) and Dq ^ is an orthonormal basis for 
pP{X)\V"'{X). 

The complete algorithm to decompose into a multi-resolution basis with re- 
spect to the interpolating nodes X is described in Algorithms [2] and [3l 

Input: Finest level n; Order of RBF m; B-j, Vfc G IC{j),j = — 1 . . .n; 

BjJ Vfc € IC{n); Degree of vanishing moments p ^ m; X. 
Output: {Co\D^\Dl . . . , D]^} 
main; 

for j •(— n to 1 step 1 do 
for fc 1 to \IC{j - 1)1 do 
I Bf ' ^ 
end 

for fc ^ 1 to \IC{j)\ do 

{Dl, 01} ^ PolyOrtho(Bi, p); 
U <r- parent{Bj,) ; 
forall the e U do 

I Bj-'^Bf^U CI 
end 
end 
end 

{D~\ C~^} ^ PolyOrtho(B[],m); 

Algorithm 2: Adapted Discrete HB Construction 
Lemma 2.8. 

= V° ®W° (B ...W = span{Co\D^\D^, 

for j — . . .n and for all k G /C(j) 

Proof. The result follows from Remark Inland that Vi is decomposed into V^~^ © 
W'-^ for alH = 1 . . . n. □ 

Remark 8. When Algorithm [J| terminates at level i — 0, there will be M{p) 
orthonormal vectors that span PP{X). 

Remark 9. At the finest level n, the number of vectors in each matrix 
corresponding to BjJ is bounded by M{p). Now, for each B^!^^ there are at most 8M(p) 
vectors from the children of B^^^ . From the procedure for the basis construction in 
section for each B^^^ there are at at most M{p) vectors in CJ! . Furthermore, 
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there are no more than 8M{p) vectors in D^^ formed. The same conclusion follows 
for each B|., for all levels i — 0, . . . , n. 

Input: B^, Degree of vanishing moment p 
Output: Dl CI 

forall the Vi e B|, do 

I V^[V, V,] 
end 

Form the N x M{p) polynomial basis matrix Q; 
Mi:^ ^ Q"V; 

aj,fc <— rank of -D^'^; 
for Z <— 1 to Oj.k do 

for I 1 to s do 

I ^k,l ^ ^k.l + C3.i,LkVi 

end 

CI ^ [CI 
end 

for I <— aj,k + 1 to s do 
for i <^ 1 to s do 

I ^Ll ^ ^k.l + dj,r,l,kVi 

end 
end 

Algorithm 3: PolyOrtho(B^,p) 

Definition 2.9. For any B'^,, k e JC{i), i = 0, . . .n, let jB^ be the number of 
vectors in B^. 

Theorem 2.10. The complexity cost for Algorithm\^is bounded by 0{N{n + \)). 

Proof. Suppose we start at the finest level n. Now, for each box in i?^, the vectors 
Ci G B^ have at most one non-zero entry. This implies that the matrix Af"jf = Q^V ^ 

is a M(j)) X \B^\ matrix and V is at most a \B^\ x matrix. Then the total 
cost to computing M"p^ for all k e lC{n) is bounded by 

C \Bk?M(j>) 

for some C > 0. Now since |Ufcg;c(n)-S^| = ^ and \B^\ is at most M{p) Vfc G )C{n), 
then the cost for computing CJ^ and Vfc e /C(n), is at most 0{N). 

At level n — 1, from Remark |9] we see that there are at most 8M(p) vectors in 
each Vfc e /C(n - 1). Forming the the matrix M^'^^'' = Q"V, Q" is at most 

M{p) X \Bl-^\ and V is at most x |B^-i|. Now,' since \iJkeK(n-i)Bl'^\ = 

it follows that the cost for computing M""^'*^, Vfc G /C(n — 1) , is at most 0{N). 
Furthermore, we have from Remark [3] that IB)!"^] ^ 8Af(p), Vfc G K.{n - 2). 
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Since for each level i, |Ui.g^(i)i3^| = N, then the total cost of computing MJ'p, 
Vfc e IC{i), is at most 0{N) and |B^"i| ^ 8M{p), Vfc e IC{i - 1). The resuh follows. 
□ 



2.6. Properties. The adapted HB construction has some interesting properties. 
In particular, the space can be decomposed in a series of nested subspaces that are 
orthogonal to V^{X) and the basis forms an orthonormal set. As a side benefit, this 
series of nested subspaces can be used to prove the uniqueness of the RBF interpolation 
problem. One important property of the adapted HB is presented in the following 
lemma. 

Lemma 2.11. The basis of described by the vectors of {Cq^,Dq^, D^, 
Di, . . . , D^}, j = . . .n, k £ IC{j) form an orthonormal set. 

Proof. We prove this by a simple induction argument. Assume that for level i the 
set of vectors {B|.} are orthonormal. Since the rows of the set V^^p are orthonormal 
and B; n = whenever I ^ k, then it follows that the vectors Llk(rK:{i){Cl, D^} 
form an orthonormal basis. The result then follows from Lemma 12.61 □ 

Definition 2.12. Given a set of unisolvent interpolating nodes X <zM.^ with re- 
spect to 'P^(M^), we form the matrix P from the basis vectors {Cq^ , Dq^ , -Dq, Di, . . . , 

From Lemmas 12.81 and 12.111 the matrix P has the following properties 

1. If w G V^{X) then P^v has dim{C^) non-zero entries. 

2. PP" = P"P = I. 

One of the immediate implications of the existence of the HB with p ^ m vanishing 
moments is that it can be used to show the uniqueness of the RBF interpolant. 

Theorem 2.13. Suppose that the set of interpolating nodes X is unisolvent with 
respect to VP{M!^) and the kernel matrix K{-, ■) satisfies Definition \1.2\ then the RBF 
Problem hl.'^] has a unique solution. 

Proof. Since we have shown the existence of a matrix P that annihilates any 
polynomial in V'^{X)^ then the result follows from the argument given in section [01 
□ 

3. Multi- Resolution RBF Representation. The HB we constructed above 
is adapted to the kernel and the location of the interpolation nodes. It also satisfies 
the vanishing moment property. The construction of such an HB leads to several 
important consequences. First, we can use the basis to prove the existence of a 
unique solution of the RBF problem, but more importantly, this basis can be used to 
solve the RBF problem efficiently. 

As the reader might recall from section II. 1[ the construction of the adapted HB 
decouples the polynomial interpolant from the RBF functions if the order of the 
vanishing moments p is equal to the order of the RBF polynomial interpolant m. 
This simple result can be extended if p ^ m. 

Theorem 3.1. Suppose X is unisolvent with respect to u and u solves the inter- 
polation problem \1.2\ uniquely, where u _L 7^™ and the kernel satisfies Definition 
If the number of vanishing moments p ^ m then 



( C^KC^ CfKT \(s \_(Cfd 

for some s G R^^-" and w G where T := [D^,D^, . . . ,-Dfc], Cj. = D^^ and 

O = dim(P"(X)). 
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Proof. Since u £ VP{X)\V"'{X) (BW° ® ... then u = C^s + Tw for some 
s e R*^-° and w e where O = dim{V'^ {X)) . Replacing u into (HH)), pre- 

multiplying by [CfT"]" and recalhng that C_L,r ± 'P"'(X) the resuh follows. □ 

Once u is solved, c is easily obtained by solving the set of equations L^Qc = 
L"{d - Ku), where L"Q G r'^Hp)^m(p)_ 

There are two ways we can solve this, since L and Q span the same space and 
have full column rank, then Q is invertible and 

{L"Q)-^L"{d- Ku). 

Alternatively, we can define the interpolation problem in terms of the basis vectors in 
L directly i.e. Q := L, which leads to 

c = L"{d~ Ku). 

For the rest of this section we describe the algorithms for solving the previous 
system of equations. The entries of the matrix 

^ _ f CfKCi_ CfKT \ 
■- ^ tHj^(Ji_ T^KT ) 

are formed from all the pairwise matching of any two vectors 'ipk rm'4'i g from the set 
V := {D^^,D^, D\,..., Dl). The entries of Kw take the form 

(3-2) E E E E K{F,{ea).F,{e,))^l^[F,{ea)]^l^[F,{e,)l 

keK(n) k'eK(n) eaGB^ e^eB;;, 

Notice that the summation is over all the vectors Co s.t. o = 1,...,N. However, 
the entries of tpl „ are mostly zeros, thus in practice the summation is over all the 
non-zero terms. 

Continuing with the same notation, the entries of d\y :— Td have the form 
E E ^lAFii^-)]f{Fp{ea)). 

Since w — Pu and u 1. VP{X), then entries of w have the form 

E E 4'lm[F,{ea)HF,{ea)l VV^I,„ G P. 

keK(n) e„eB^? 

It is clear that from the set V the matrix Kw is ordered such that the entries of any 
row of K\Y sums over the same vector ipl e T>. In Figure [3] a block decomposition of 
the matrix Kw is shown. 

One interesting observation of the matrix Kw is that most of the information 
of the matrix is contained in a few entries. Indeed, for integral equations it can be 
shown that an adapted HB discretization matrix requires only 0{Nlog{N)^-^) entries 
to achieve optimal asymptotic convergence |18j . This has been the approach that was 
followed behind the idea of wavelet sparsification of integral equations [9j [TJ [2l [181 13 

However, it is not necessary to compute the entries of Kw for efficiently invert- 
ing the matrix, but instead we only have to compute matrix vector products of the 
submatrices K^^ in 0{N) or 0{Nlog{N)) computational steps. 
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w 



k: 



w 



K. 



0,0 



Fig. 3.1. Organization of the linear system Kyi/w = d. The block matrices consist of all 

the summations in Eauation \3.S\ for ciH 4'k m^^i g ^ ^ ^^'^^ belong to level i and j. The vectors d\^ 
correspond to all inner products of -ip^^ ^ G T> at level i. Similarly for w, where w = Tu. 



3.1. Pre-conditioner. One key observation of the matrix Kw is that each of 
the blocks is very weh conditioned. Our experiments indicate that this is the case 
even for highly non uniform placement of the nodes. We propose to use two kinds 
of preconditioners on the decoupled RBF problem: a block SSOR and a diagonal 
preconditioner based on the multi-resolution matrix Kw- The block SSOR multi- 
resolution preconditioner shows better iteration counts and is a novel approach to 
preconditioning. However, in practice, the simplicity of the diagonal preconditioner 
makes it easier to code and is faster per iteration count for the size of problems in 
which we are interested. 

The block SSOR preconditioner on the decoupled RBF takes the form of the 
following problem: 



(3.3) p-^Kww = p-^dw, 



where Kw ^ Lw + Dw + and 





' 








" 




'^w 













J^w 
















t^w 









Lw — 










and Dw — 












L J^w 




Kw 


. 













Kw 



The block preconditioner is constructed as P = {Lw + Dw)I^w^^w + Dw)- 

We can solve this system of equations with a restarted GMRES (or MINRES 
since the matrices are symmetric) iteration [48] . To compute each iteration efficiently 
we need each of the matrix vector products of the blocks K]^ to be computed with 
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a fast summation method. We have the choice of either computing each block as 
matrix-vector products from a fast summation directly, or a sparse preconditioner 
that can be built and stored. 

3.1.1. Fast Summation. It is not necessary to compute the matrix Kw di- 
rectly, but to employ approximation methods to compute matrix-vector products 
Kwctw efhciently. To such end we make the following assumption. 

Assumption 2. Let yi,y2, ■ . ■ ,yNi e M.^, ci, C2, . . . cjvi e M, Rbf span{ 
K{x, j/i), K{x, y-i) . . . , K{x, yNi)), T = span{4>i, 02, ■ • ■ , 4>q\j for some set of lin- 
early independent functions 0i , (/)2 , • . • , 0q • We are interested in the evaluation of the 
RBF map 

'^(^) ^c^K{x,y^), 

i=l 

where x G M^. Suppose there exists a transformation F{(j){y)) : Rbf T with 0{Ni) 
computational and storage cost. Moreover, any successive evaluation of F((j){y)) can 
be performed in 0(1) operations and 

\F{^{-))-m^CFA' 



where p G Z"*" is the order of the fast summation method, A = kil, > and 

a> 1. 

There exist several methods that satisfy, or nearly satisfy. Assumption [2j In 
particular we refer to those based on multi-pole expansions and the Non-equidistant 
Fast Fourier Transform [6l |45l [59] . 

The system of equations p.3p can now be solved using an inner and outer iteration 
procedure. For the outer loop a GMRES algorithm is used, where the search vectors 
are based on the matrix P~^Kw- 

The inner loop consists of computing efficiently the matrix-vector products P^^KiyciW) 
for some vector aw G ■ This computation is broken down into two steps: 

Step One To compute efficiently Kwotw for each matrix vector product K^Oy^, we 
fix ^pl. „j from Equation p.2|) and then transform the map 



J2 E Kixa,yb)^ijF,{F^\yt))]ai 



1*/ GD3 Vb&X 



for all the vectors V'i'g G ^■' i into a new basis 02, • • ■ , </'g}- The computational 
cost for this procedure is O(A^i), where Ni corresponds to the number of non-zero 



entries of all ipf g & F>^ ■ Since the computational cost for evaluating the new basis on 
any point Xa G A is 0(1), then the total cost for calculating each row of A'^a^^ is 



0(Ai -|- A2), where A2 is equal to all the non-zero entries of ipl. , 



Now, since for each j — 0, 



N, then Ai is bounded by CN 



for some C > 0. For the same reason A2 is also bounded by CN. This implies that 
the total cost for evaluating the matrix vector products Kwctw is O {{n + 1)^A). 
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Step Two: The computation of P ^f^w, where /3w 
three stages. First, let -fw ■— {L^ + Dw)^^ Pw ^ then 



K\yoiw is broken up into 



if, 



w 







K, 



w 

2,2 



W 






w 



K 



w 



Since {L^ + Dw) has a block triangular from, we can solve the inverse-matrix vector 
product with a back substitution scheme. Suppose that we have solved , . . . , 7^^, 
then it is easy to see from the triangular structure of {L^/ + Dw) that 



a 



w 



k=l 



The cost for evaluating this matrix vector product with a fast summation method is 
0{{n + l)^iV + k{n + 1)N). The last term comes from the block matrices in Dw, 
which are inverted indirectly with k Conjugate Gradient (CG) iterations [29l[25]. In 
Section |4] we show numerical evidence that k converges rapidly for large numbers of 
interpolating nodes. 

The second matrix vector product, 77 := Dwlw is evaluated in 0{N) using a fast 
summation method. Finally the last matrix vector product /i :— {Lw + Dw)~^'r]w 
can be solved in 0{{n + 1)'^N + knN) by again using a back substitution scheme. 

Remark 10. For many practical distributions of the interpolating nodes in the 
set X , the number of refinement levels n-\-\ is hounded by Ci log N '61. For these types 
of distributions the total cost for evaluating P^KwUw is 0{Nlog^N) assuming k is 
bounded. 

This approach is best for large scale problems where memory becomes an issue 
and for large vanishing moments. For small to medium size problems the blocks K^^ 
can be computed in sparse form and then stored for repeated use. 

3.1.2. Sparse Pre-conditioners. In this section we show how to produce two 
types of sparse preconditioners by leveraging the ability of HB to produce compact 
representations of the discrete operator matrices. 

The key idea is to produce a sparse matrix P of P from the entries of the blocks 

. This is done by choosing an appropriate strategy that decides which entries to 
keep, and which ones not to compute. 

Although it is possible to construct an accurate approximation of P and Kw for 
all the blocks {i,j = 0...n), the computational bottleneck lies in computing the 
matrix vector products with {K^)~^. Thus it is sufhcient to compute the sparse di- 
agonal blocks of Kw . The off-diagonal blocks are computed using the fast summation 
method described in section [3. 1.1 1 

Definition 3.2. For every vector V"*: m ^ ^ '"^'^ associated support box 

e B, define the set to be the union of B™ and all boxes in Bj that share a 
face, edge or corner with i.e. the set of all adjacent boxes. 

To produce the sparse matrix P we execute the following strategy: For each entry 
in Ky^ corresponding to the adapted HB vectors V'fcm' ^\ g ^ O'^ly compute 

this entry if 
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(3.4) dtst{LlLl) inf HL^f) - Ll{y)\\LHm) ^ n,^, 

x,y 

where Ti^i G for i = . . .n. For an appropriate distance criterion r^.i we can 
produce a highly sparse matrix that is close to (and respectively P) in a 
matrix 2-norm sense. 

Definition 3.3. The distance criterion Tij is set to 



(3.5) T,,, 2"-', 

With this distance criterion it is now possible to compute a sparse representation 
of the diagonal blocks of Kw- This can be seen from the following lemma. 

Lemma 3.4. Let Ba he a hall in 9? with radii ra centered around the midpoint 
a G R'^ of the hox . Similarly, let Bf, be a hall in MP with radii ri, centered around 
the midpoint b of the box Bf . Suppose tpl ^ and ipj ^ satisfy the vanishing moment 

condition Equation for all r G V^{X), and B^^ C Ba, Bj C Bb- If K{x,y) 
satisfies Assumptions^ then the following estimate holds: 

NC Iv+DrP^'^ 

^^^"•"^'^''S'^^ dlst{Ba,Bi,)^+iP+^) pi 

where a('0fc m^'^l q) e.qual to 



(3-6) E E E E ^(i^p(ea),^;,(ej,))Vl,„[i^,(e,)]V'i';,[F,(efa)]. 

fceK(n) k'£lC{n} e„eB^ ^beB"^, 

Proof. Since we assume that K{x, y) is analytical, then by Taylor's theorem we 
have that for every x G Ba 

K{x.y)^ E ^'^^l^'^ i^-aT + Rcix^y). 

|a|=0 

where (x — a)" := (xi — ai)°'^{x2 — a2)°'^{x^ — as)"^, a! := ai!a2!a3!, 



(3.7) i?a(x,y):= E ^-^—^f D^Kf{a + s{x-a),y)ds 

11.1 Jo 

\a\=p+l 



and s G [0, 1]. Now recall the vanishing moment property of the basis, then Equation 
(13. 6p becomes 

E E E E i?a(^i(ea),Fp(e,))V^,„[i^,(fa)]V/,,[^^g(y6)]- 
keK(n) k'eK(n) eaE'B^ cteB^, 

Now, let z = a + s{xa ~ a). By applying the chain rule and Assumption [T] we obtain 

\D'iK{x-,y)\^ 
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Since z ^ Ba and y ^ Bf, then 



(3-8) \D'iK(x,y)\i^ - — — 

where dist{Ba, Bi,) := inis^Ba,yi£Bb W ~ y\^ i-^- the minimum distance between the 
two balls Ba and Bf,. Replacing equation (|3.8I) into (13.71) . then \Ra{xa, yb) \ is bounded 

by 



(3.9) \RJxa,yb)\ V 7 r-TT 

^ ' ' "^"'^^^^i^ dist(Ba,Bb)i+\'^\ 

|q|=P+1 

< g ip + 

~" dist{Ba,BbY+iP+^) p\ 

From the orthonormality of f/;^, and -0/^ the result follows. 
□ 

The previous lemma shows the decay of the entries of the matrix Kw is dependent 
on the distance between the respective blocks and the number of vanishing moments. 
If p is chosen sufficiently large (for a biharmonic p = 3 is sufficient), the entries of 
Ky/ decay polynomially fast, which leads to a good approximation to Kw- 

Under this sparsification strategy, it can be shown that \\Kw — Kw\\2 decays 
exponentially fast as a function of the number of vanishing moments p with only 
0(iV(n+ 1)^) entries in K^/- The accuracy results have been derived in more detail in 
an upcoming paper we are writing for anisotropic spatially varying RBF interpolation 

m- 

Lemma 3.5. Let N(yl) : M^^^ — > M+, he the number of non-zero entries for the 
matrix A, then we have 



s" ds 



(3.10) N(i^^') s; 8Af (p)73iV 

Proof. First, identify the box L\. ^ that embeds ipl, ^ and the distance criterion 
Ti^i associated with that box. Now, the number of vectors ipl ^ and corresponding 
embedding L\ ^ that intersect the boundary traced by t is equal to (2^*3 + 2Ti^i + 
2-i+i)3/2-» ^ 23(*-*)73 = 7^ (as shown in Figure [X^ . From Remark [S] there are at 
most 8M{p) HB vectors per cube. The result follows. □ 

To compute the block diagonal entries of K^, for i = 0, . . . , n in 0{N log N) 
computational steps, we employ a strategy similar to the fast summation strategy in 
section [3. 1.1 1 For each row of KIy, locate the corresponding HB from Equation 
(|3.2p and transform the map 



(3.11) E E E K{Fp{x,eaMlMiea)] 

keK(n) k'eK.{n) CaGB^ 

into an approximation G(x, V'fcm) '■= * by applying a fast summation 

method that satisfies Assumption [51 Any entry of the form a('0fcm'^ig) '^^^ be 
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computed by sampling G{x) at locations corresponding to the non-zero entries of 
ipl g , and the sampled values can be used to multiply and sum through the non-zero 
values of g. 

Theorem 3.6. Each block is computed in 0{N) steps. 

Proof. The cost for computing the basis of G{x) corresponding to V'fe.m most 
0{NI where is the number of non zeros of i/'^. Now, since |Ufcg;c(i)^A;| — ^ 
the cost of computing G{x, ipl ^) for all the vectors ipl ^ at level i is keicii) 

For each row in iiT^, from Lemma [3.51 there is at most 8M(p)7^ entries. This 
implies that for each vector ,„ we need only 0(1) evaluations of G[x,^\^^) to 
compute a row of Ky^. Now, if we sum up the cost of evaluating G{x,ipl. ,„) for all 
the rows then the total cost for evaluating is 0{N). □ 

Remark 11. For each entry in K^, the corresponding basis vectors i^km'''l'ig 
can he found in 0{n) computational steps. This is easily achieved by sorting the set 
of cubes {i3/};eK;.j=i,...,n with an octree structure, i.e. a parent-child sorting. 

Remark 12. Note that further improvements in computation can be done by 
observing that tpl, is a linear combination of the vectors 4>\~^ G . Thus equation 
i3.11\) can be written as a linear combination of 



(3.12) E E E K(x,F,{eaM-'[F,{ea)]. 

keKin) k'£K(n) CaGB^ 

// two vectors "0^ ^ and ip]. are in the same cube then it is sufficient to compute 
equation i3.11\) once and apply the coefficients computed in the construction of the 
entire HB. In addition, if two vectors tp'^, and tp^, share the same vector (j)] o ^ 
V^~^ , the same procedure can be applied. In our results in Section^ we apply this 
scheme to compute the SSOR and diagonal blocks. 

Remark 13. As our results show a very simple diagonal preconditioner can be 
built from the blocks of . In particular 

\ 



J 

This preconditioner is also much easier to construct in practice. 

4. Numerical Results. In this section we apply the multi-resolution method 
developed in section[3]to the RBF interpolation and regression problem. These will be 
of different sizes and polynomial orders for the biharmonic, multiquadric and inverse 
multiquadric function kernel in M?. These kernels can be written in a common form 
K[r) := (r^ + 5"^)'-/'^, where r := i5 G R and Z g Z. The RBF solver is also tested 
on a generalized least squares and Kriging problem. The distribution of the nodes in 
X are separated into three cases. 

Test Case 1: We test our method on several sets of randomly generated interpo- 
lating nodes in the unit cube in R^^ as shown in Figure a). The sets of interpolating 
nodes {Xi, . . . , Xr} vary from 1000 to 400,000 nodes. Each set of interpolating nodes 
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Fig. 3.2. Distance criterion cut-off boundary for the cube L^. , corresponding to all the vectors 
^km^ D^. Assume j ^ i and h = , and each cube B^, is evenly divided by Bj . With this in 
mind, the cut-off criterion traces a cube of length 2Tij plus the side length of . For any vector 
tpj ^ such that Lj crosses the cut-off boundary, we compute the corresponding entries in the matrix 

. As V increases, by the error estimates (3.91) . the entries a{ipl, V'^g) decay as a polynomial 
of p with respect to the distance vh^ . 




Fig. 3.3. Example of sparsity pattern for Kw created by applying the distance criterion to each 
of the blocks of Khi . 
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(a) 






(b) 



Fig. 3.4. (a) RBF interpolating set: Interpolating set with a thousand nodes for Test Case 
1. The function value defined on each node is mapped from the color bar. (b) RBF interpolating 
set: V-plane example interpolating set with four thousand nodes. The interpolating nodes are 
obtained by projecting the data nodes from Test Case 1 onto two intersecting planes. The function 
values on each node is randomly assigned. 
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is a subset of any other set with bigger cardinahty, i.e., Xi C The function 

values on each node are also grouped into r sets {61, . . . , 6,.} with randomly chosen 
values and satisfy also hi C bj+i. 

Test Case 2: For this second test we apply a projection of the data nodes generated 
in Test Case 1 onto two non-orthogonal planes K'^, then remove any two nodes that 
are less than 10~^ distance from each other. The V-plane intersecting are shown in 
Figure inm^b) . Note, that only about 0.1% of the centers were eliminated and the 
number of nodes in the table is approximate. 

Test Case 3: For the third test, we still use interpolating nodes in the unit cube 
in 3D. But instead of having the nodes uniformly distributed as in Test Case 1 and 2, 
we use a bimodal Gaussian distribution with a standard deviation of 0.25 and mean 
vectors of (0,0.5,0.5) and (1,0.5,0.5), respectively (see Figure 1575]) . For the right- 
hand-side we assume that the observations come from a third-degree polynomial, 
contaminated by a random noise with the covariance matrix defined by the inverse 
multiquadric RBF. This is an example where the RBF problem can be recast as a 
special case of the generalized least squares problem, also known as Kriging |33l [36] . 
In this type of problems, the order of the polynomial set V"^{X) is associated with 
the dimension of independent variables in the regression model. Hence, the ability of 
the RBF solver to deal with higher-order polynomials become essential, if a higher- 
dimensional regression model for Kriging is to be considered. 

Test Setup: The implementation of the multi-resolution discrete HB method is 
performed in C++. The GMRES algorithm is incorporated from PETSc (Portable, 
Extensible Toolkit for Scientific computation) libraries [3] into our C++ code. Inner 
and outer iterations are solved using a GMRES algorithm with 100-iteration restart. 
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In the rest of this section when we refer to GMRES iterations, we imply restarted 
GMRES with a restart for every 100 iterations. Since the preconditioned system will 
introduce errors in the RBF residual of the original Problem ll.2[ the accuracy of the 
GMRES is adjusted such that the residual e of the unpreconditioned RBF system is 
less than 10"'^. For the results shown in Tables I4.2[ 14.31 14.41 and 14. 5[ we show the 
GMRES accuracy residual. 

All the numerical tests with a fast summation method are performed with a single 
processor version of the Kernel-Independent Fast Multipole Method (KIFMM) 3D 
code (http: / /mrl.nyu.edu/'~ harper/kifmmSd/documentation/index.html). This code 
implements the algorithm described in [59]. The accuracy is set to relative medium 
accuracy (10"*^ to 10"^). 

All simulations were performed on a single core on the TACC Lonestar cluster 
(http:/ /www. tacc.utexas.edu/services/uscrguidcs/loncstar/). Each node has two pro- 
cessors. Each is a Xeon 5100 series 2.66GHz dual-core processor with a 4MB unified 
(Smart) L2 cache. Peak performance for the four cores is 42.6 GFLOPS. 

Test Examples: 

• Condition number k of underlying system of equations with respeet to sealing 
all the domain. One immediate advantage our method has over a direct 
method is the invariance of the conditioning of the system of equations with 
respect to the scale of the polynomial domain. This is a consequence of the 
construction of the HB polynomial orthogonal basis. 

Removing the polynomial source of ill-conditioning makes the system easier to 
solve. The condition number of the full RBF interpolation matrix is sensitive 
to the scaling of the domain. We show this by scaling the domain with respect 
to the polynomial interpolation and fix the scaling for the RBF part: 



K aQ 
aQ^ O 



As shown in Table ITT] the condition number for a 1000 center problem with 
m — 3 and (f)(r) = r deteriorates quite rapidly with scale a. In particular, for 
a scaling of 1000 or larger an iterative method, such as GMRES or CG, stag- 
nates. We note that the invariance of the condition number of the decoupled 
system was also observed in (SJ [S^ and proven in [5] . 

Another important observation is that the same result will apply for a niulti- 
quadric, or inverse multiquadric of the form = {r"^ + 5'^)^\ S gM., due to the 
polynomial decoupling from the RBF matrix. In general, this will be true for 
any strictly conditionally positive (or negative) definite RBF. However, the 
matrix Kw will still be subject to the underlying condition number of K. In 
other words, if k{K) deteriorates significantly then Kw will be ill-conditioned 
also. 



Scale 


a = 1 


a = 10 


a = 100 


a = 1000 


a = 10000 


K of RBF system 


9 X 10^ 


8 X 10^ 


8 X 10" 


8 X IQi-* 


9 X 10^** 


k{Kw) 


762 


762 


762 


762 


762 



Table 4.1 

Condition number for RBF system matrix ( equation versus scale of the problem for a 

thousand nodes for Test Case 1 with respect to the biharmonic (t>{r) := r. As observed, increasing 
the scale by alpha the condition number deteriorates very rapidly. In particular, for a condition 
number higher than the reciprocal of machine position and the GMRES or CG algorithm stagnates. 
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Block CG iteration count for p = 3 and m — 3 




50000 100000 
Number of interpolating nodes 



150000 



Fig. 4.1. Inner CG block iterations for the case p = 3 and m = 3 and outer GMRES iteration 
counts for multiple cases. As observed the inner CG iteration count, at 10~® residual relative error, 



for each block increases rapidly with size, 
0, . . . , 4 lead to the same iteration counts. 



but eventually stabilizes. The sparse blocks , i 



• Biharmonic RBF, m = 3 (cubic) and p = 3 This is an example of a higher 
order polynomial RBF interpolation. We test both the SSOR and diagonal 
pre-conditioner on Test Case 1 & 2. For the pre-conditioner the accuracy of 
the GMRES outer iterations is set such that the residual e := HiTvi/w — dn/jj ^ 
10~^. For the pre-conditioner the number of iterations to invert blocks K^, 
i — 0, . . . ,n with an accuracy of 10"^ are shown in Figure 14.11 Due to the 
condition number of the blocks they quickly converge with either a CG, or 
a GMRES solver. Moreover, the number of iterations appear to grow very 
slowly with size. 

From the results in Table 221 (a) for Test Case 1, the number of restarted 
GMRES iterations grows as 0{N'~'-'^^). But, observe that for the changes from 
N = 4,000 to N = 8,000 and from N = 32,000 to N = 64,000 the complexity 
increases by a factor of 5. We suspect that the Level 1 and 2 cache are used 
less efficiently at this point due to the size of the problem. This increase is 
shown both in the cost per iteration and inner block computation. Fitting 
a linear regression function to the log-log plot leads to a growth of N^-^ for 
time complexity. 

In Table 14.21 (b) the iteration and timing results for the sparse SSOR pre- 
conditioner for test cast 2 (v-plane) are shown. This is a much harder problem 
due to the corner and the projection of the random data from Test Case 1 
onto two planes at 135 degrees from each other. The total GMRES iteration 
grows as C7V° '^'*, which leads to a N'^-^Hog'^N time complexity. 
In Table 14.21 (c) the iteration and timing results for the diagonal precondi- 
tioner with Test Cast 1 (uniform cube) are shown. We can observe that 
although the GMRES iteration count is higher than that of the SSOR pre- 
conditioner {CN^'^^), the simplicity of the pre-conditioner allows every matrix- 
vector product to be computed much faster. Fitting a line to the log data 
leads to a total time complexity increases of CN^-^"^. The memory constraints 
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are also much lower than the SSOR since only N entries are needed to be 
stored for the pre-conditioner. It requires only 5613 seconds to solve the GM- 
RES iteration for the 100,000 center problem. The diagonal pre-conditioner 
is built in 7747 seconds and the total time complexity is 3.8 hours. 
In Table l42l fd). the iteration and timing results for the diagonal pre-conditioner 
with test cast 2 (v-plane) are shown. This is a harder problem than Test Case 
1 and can be reflected in the time complexity increase {CN^-^*} and the GM- 
RES iteration count (CA^^ ''^). Another observation is that the time required 
to compute the diagonal pre-conditioner is about one half compared to Test 
Gase 1. This is due to the adaptive way we compute the diagonal, recall 
Remark [H 

Biharmonic RBF, m = (constant), p = 3 From Table we can observe 
that the results for Test Case 1 and 2 are almost the same as for case m = 
3,p = 3 and the GMRES iterations are almost identical. This indicates that 
the efficiency of our method is invariant under polynomial interpolation. 
Biharmonic RBF m = 1 (linear) and p = 3. For this case we only present 
the diagonal pre-condition since overall it is more effective than the SSOR. 
In Table 113] (a) the iterations and timing results for Test Case 1 are shown. 
An intriguing result is that the number of GMRES iterations are exactly the 
same for each case to = 0, 1 or 3. The iteration and total timing results are 
also very close and have almost the same growth. This is also observed in the 
iterations and timing results for diagonal-precondition for Test Case 2. 
Multiquadric and inverse multiquadric RBF, m — 3 (cubic), p = 3. For the 
case of the multiquadrics with 6 = 0.01, the iteration count increases signifi- 
cantly, as shown in Table H75F a). The number of GMRES iterations increases 
as CiV°-^^. This is a harder problem to solve due to the ill-conditioning in- 
troduced by the constant term S, as reflected by the increase in the number 
of GMRES iterations. Fitting a line through the log-log plot of the total time 
leads to a CN^'^^ time complexity. Although if we increase the number of 
interpolating nodes, this would probably match with the biharmonic case. 
In contrast, the inverse multiquadrics result shown in Table l475l b) is a better 
conditioned problem leading to around the same complexity as for the bihar- 
monic case, but the constant is lower. We note that to achieve comparable 
interpolation accuracy, the value of S for the inverse multiquadric generally 
needs to be larger than for the multiquadric case. And the larger the S the 
more ill-conditioned the RBF interpolation problem. 

Regression In this part we demonstrate how our code can be used to compute 
a regression problem. Let (fl, T, P) be a complete probability space. Suppose 
that e G x is a random vector such that E[e] =0 and E[e£^] = K, where 
the covariance matrix K is assumed to be symmetric positive definite. Given 
a set of observations (realizations) Y, the regression problem is modeled as 



The generalized least squares solution is c = [Q^ K ^Q) ^Q^K [55] . 



N 




or in matrix form 



y ^ Qc + e. 
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N 


HB (s) 




GMRES 


GMRES (^) 


Total (s) 


1000 


67 


3 


6 


1 


75 


2000 


68 


12 


8 


2 


93 


4000 


70 


31 


11 


4 


143 


8000 


99 


259 


15 


21 


665 


16000 


123 


1024 


20 


36 


1857 


32000 


137 


3612 


26 


81 


5855 


64000 


224 


10388 


38 


387 


25321 


100000 


326 


19662 


45 


500 


42486 


(a) SSOR Pro-conditioner Test Case 1 


N 


HB (s) 




GMRES 


GMRES (^) 

V if.p.r 1 


Total (s) 


1000 


69 


70 


13 


2 


92 


2000 


70 


80 


19 


3 


142 


4000 


70 


101 


25 


17 


528 


8000 


79 


398 


34 


37 


1589 


16000 


91 


1146 


55 


102 


6699 


32000 


110 


3756 


87 


230 


23722 



(b) SSOR Pro-conditioner Tost Case 2 



N 


GMRES 


GMRES residual 


Itr (s) 


Diag.(s) 


Total (s) 


1000 


33 


1.22E-04 


4 


70 


74 


2000 


45 


5.18E-05 


11 


73 


84 


4000 


64 


2.73E-05 


34 


83 


117 


8000 


90 


1.15E-05 


145 


217 


362 


16000 


126 


4.98E-06 


287 


507 


794 


32000 


188 


1.84E-06 


965 


1084 


2049 


64000 


281 


1.16E-06 


4080 


3583 


7664 


100000 


319 


6.68E-07 


5613 


7747 


13361 


200000 


491 


2.46E-07 


18135 


21716 


39851 


400000 


710 


1.21E-07 


68921 


61916 


130837 


(c) Diag. Pre. Test Case 1 


N 


GMRES 


GMRES residual 


Itr (s) 


Diag.(s) 


Total (s) 


1000 


73 


5.74E-05 


7 


68 


75 


2000 


102 


2.26E-05 


23 


71 


94 


4000 


168 


6.97E-06 


69 


78 


146 


8000 


272 


4.73E-06 


247 


156 


403 


16000 


348 


1.18E-06 


569 


297 


866 


32000 


921 


2.45E-07 


3776 


650 


4425 


64000 


1075 


1.26E-07 


7225 


1877 


9102 


100000 


2209 


6.39E-08 


27248 


3434 


30683 



(d) Diag. Pre. Test Case 2 



Table 4.2 

Results for biharmonic 4>{r) := r, m = 3 (Cubic), p = 3. (a) Iteration and timing results for 
the sparse SSOR pre- conditioner for test cast 1 (uniform cube). The first column is the problem size 
round up to the lOOO's; the second the time in seconds to compute the HB. The third is to compute 
the sparse inner blocks K^; the fourth is the number of iterations such that e ^ 10~^ and the fifth is 
the time per GMRES iteration, (b) Iteration and timing results for the sparse SSOR pre- conditioner 
for test cast 2 (v-plane). (c) Iteration and timing results for diagonal pre- conditioner for test cast 
1 (uniform cube). The first column is the problem size; the second is the number of iterations such 
that e ^ 10~^; the third column is the GMRES residual needed to achieve e ^ 10~^ accuracy; the 
fourth is the total time for the GMRES iteration and the fifth is the total time in seconds to compute 
the diagonal of all inner blocks K]j^. The last column is the total time in seconds, (d) Iteration and 
timing results for diagonal pre- conditioner for test cast 2 (v-plane). 
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N 


GMRES 


GMRES residual 


Itr (s) 


Diag.(s) 


Total (s) 


1000 


33 


1.22E-04 


4 


68 


72 


2000 


45 


5.18E-05 


11 


71 


82 


4000 


65 


2.73E-05 


34 


83 


117 


8000 


89 


1.15E-05 


130 


216 


345 


16000 


126 


4.98E-06 


287 


505 


792 


32000 


188 


1.84E-06 


968 


1081 


2050 


64000 


281 


1.16E-06 


4060 


3588 


7648 


100000 


319 


6.68E-07 


5514 


7746 


13261 


200000 


491 


2.46E-07 


18135 


16417 


35807 


400000 


710 


1.21E-07 


68976 


61829 


130806 


(a) Diag. Pre. Test Case 1 


N 


GMRES 


GMRES residual 


Itr (s) 


Diag.(s) 


Total (s) 


1000 


77 


5.78E-05 


7 


68 


76 


2000 


103 


2.26E-05 


23 


71 


94 


4000 


174 


6.97E-06 


71 


78 


147 


8000 


271 


4.73E-06 


246 


156 


427 


16000 


374 


1.18E-06 


612 


296 


884 


32000 


901 


2.45E-07 


3313 


650 


3966 


64000 


1183 


1.26E-07 


7858 


1876 


9734 


100000 


2173 


6.39E-08 


26904 


3436 


30341 



(b) Diag. Pre. Test Case 2 



Table 4.3 

Results for biharmonic <j>{r) := r, m = (constant), p = 3. (a) From this test case we observe 
that the number of iterations for the GMRES iteration is exactly the same as for m = 3. There are 
differences in the construction of the diagonal-preconditioner and the matrix vector products with 
Kw However, the total time complexity is almost the same, (b) Iteration count and timing results 
for pre- conditioner for Test Case 2(V-plane). 



However, this is the same solution for Problem 11.21 [39] if we assume that 
K{r) := (j){r) and K is symmetric positive definite. 

The regression can now be extended to incorporate information on the resid- 
ual. First, Define a linear predictor y{x) as 

N 
i=l 

where 7 G and then minimize the mean squared error E[{y{x) — y{x))'^\. 
We can then derive a Best Linear Unbiased Estimator (BLUE) y*{x), 

N 

1=1 

where Ci and Ui are solutions to the RBF interpolation problem (jl.2p with 
di = y{xi) and K{r) = (j){r) |33| . Of course, this regression model will only be 
valid for RBF kernels that satisfy the positive definiteness of the covariance 
matrix R. We note that many decaying RBFs, such as Gaussian and inverse 
multiquadric, satisfy this condition [37) . 
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N 


GMRES 


GMRES residual 


Itr (s) 


Diag.(s) 


Total (s) 


1000 


33 


1.22E-04 


4 


68 


74 


2000 


45 


5.18E-05 


11 


71 


84 


4000 


65 


2.73E-05 


34 


78 


116 


8000 


89 


1.15E-05 


130 


156 


343 


16000 


126 


4.98E-06 


288 


297 


793 


32000 


188 


1.84E-06 


951 


651 


2043 


64000 


281 


1.16E-06 


4062 


358 


7642 


100000 


319 


6.68E-07 


6011 


7746 


13750 


200000 


491 


2.46E-07 


16412 


19378 


35792 


400000 


710 


1.21E-07 


68962 


61769 


130731 


(a) Diag. Pre. Test Case 1 


N 


GMRES 


GMRES residual 


Itr (s) 


Diag.(s) 


Total (s) 


1000 


77 


5.74E-05 


7 


68 


76 


2000 


103 


2.26E-05 


23 


71 


94 


4000 


170 


6.97E-06 


69 


78 


147 


8000 


298 


4.73E-06 


271 


156 


427 


16000 


357 


1.18E-06 


588 


297 


884 


32000 


901 


2.45E-07 


3315 


651 


3966 


64000 


1183 


1.26E-07 


7856 


1877 


9734 


100000 


2173 


6.39E-08 


26851 


3433 


30285 



(b) Diag. Pre. Test Case 2 



Table 4.4 

Results for biharmonic (t>{r) := r, m = 1 (linear), p = 3. (b) Iteration count and timing results 
pre- conditioner for Test Case 2(V-plane). 



Here for illustration purposes, we assume that the covariance matrix is defined 
by the inverse multiquadric RBF, K{r) = (5(r^ + 5'^)~^ . The observations are 
simulated as 

(4.1) Y{x,) = \\x,\\l+e{x,). 

In Table 14. 6[ we show the timing result of solving the Kriging regression 
problem for Test Case 5 with 5 = 0.01 using our proposed method for diagonal 
pre-conditioner. The result shows linear growth of total time with respect to 
the problem size. Although for a larger coefficient 5 the system can become 
ill-conditioned and our solver would not be as efficient. Note that due to the 
difficulty of generating large random vectors with RBF covariance matrix, we 
limited the size to 16,000 centers. 

In the BLUE estimation problem, in general it is generally assumed that the 
canonical representation of the covariance matrix is known, but the scale 5 is 
unknown. The BLUE regression problem now becomes solving for y and the 
coefficient 5 such that 

5* — ar grains E[{y{x,S) — v{xf'\- 

We test the BLUE estimating problem assuming that the observation data is 
generated by the noise model in Equation (j4.1[) with (5 = 0.01. The regression 
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N 


GMRES 


GMRES residual 


Itr (s) 


Total (s) 


1000 


40 


1.67E-04 


2 


72 


2000 


57 


3.94E-05 


13 


84 


4000 


92 


2.31E-05 


36 


117 


8000 


130 


8.94E-06 


156 


303 


16000 


197 


3.40E-06 


603 


902 


32000 


341 


6.53E-06 


1977 


2527 


64000 


683 


9.23E-07 


3899 


5400 


100000 


852 


2.96E-07 


9845 


13003 


(a) 


N 


GMRES 


GMRES residual 


Itr (s) 


Total (s) 


1000 


9 


3.12E-04 


1 


70 


2000 


7 


2.59E-03 


1 


73 


4000 


14 


9.52E-04 


5 


90 


8000 


18 


8.73E-04 


15 


180 


16000 


27 


1.08E-03 


43 


402 


32000 


63 


l.OlE-03 


192 


912 


64000 


85 


8.62E-04 


682 


2735 


100000 


109 


4.37E-04 


1440 


5830 



(b) 



Table 4.5 ^ 
Iteration and timing results for diagonal pre- conditioner, multiquadric (f>{r) := (r^ + 0.01^)^ 3 , 
and test case 1 (uniform cube), m = 3, p = 3 (a) Multiquadric: The second column is the number 
of GMRES iterations, the third is the time it takes to achieve e C 10~^ accuracy and the last is the 
total time to build the pre- conditioner, compute the HB basis and solve the iterations, (b) Inverse 
multiquadric: The column data is the same as for (a). 



model is now performed with different orders as shown in|4l The estimate and 
the Mean Square Error of the estimate on the unit cube [0, 1]"^ at = 0.5 
is shown side by side for m = (constant), m = 1 (linear) and m = 2 
(quadratic) orders. 

We can see in Figure HKa) that with m = 0, the interpolated regression 
result deviates significantly from the underlying smooth signal ||a;||f . As we 
increase to, as shown in Figure |4l^b)(c), we start to recover the smooth signal 
with m = 2, we have a recovered signal that is almost free of noise. The 
improvement as to increases is also evident from the rapid decrease of the 
expected point-wise mean squared error, as shown in the plots from the right 
column. All the BLUE estimate results were performed with the Kriging 
toolbox [33] . 

5. Conclusions. In this paper we construct a class of discrete HB that are 
adapted both to the RBF kernel function and the location of the interpolating nodes. 
The adapted basis has two main advantages: First the RBF problem is decoupled, thus 
solving the scale dependence between the polynomial and RBF interpolation. Second 
with a block SSOR scheme, or a simple diagonal matrix built from the multi-resolution 
matrix Kw^ an effective pre-conditioner is built that reduces significantly the iteration 
count. Our result shows a promising approach for many RBF interpolation problems. 

Further areas of interest as future work: 
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N 


GMRES 


GMRES residual 


Total (s) 


1000 


48 


1.48E-04 


71 


2000 


95 


6.71E-05 


88 


4000 


134 


3.16E-05 


132 


8000 


150 


2.71E-05 


268 


16000 


274 


1.85E-06 


782 



Table 4.6 



Iteration and timing results for diagonal pre- conditioner, inverse multiquadric kernel (/'(r) := 

0.01{r'^ +0.01'^)"', and Test Case 3 (bimodal Gaussian distributed centers), m = 3, p = 3. The 
second column is the number of GMRES iterations, the third is the GMRES residual needed to 
achieve e ^ 10""^ accuracy, and the last is the total time to compute the HB basis, build the pre- 
conditioner and solve the iterations. 




(c) 

Fig. 4.2. Interpolation results from the Kriging regression model with a correlation defined by 
the inverse multiquadric function. The figures in the left column show the interpolation result on a 
40 by 40 mesh on the cross-section of a unit cube [0, 1]^ at = 0.5. The figures in the right column 
show the point-wise mean squared error (MSE) on the sampling mesh. The results are computed for 
varied space V"^{X) of independent regression variables: (a) m = (constant), (b) m = I (linear), 
(c) m = 2 (quadratic). 
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• Theoretical bounds of Convergence rates The next step for the method that 
we have developed is to develop convergence rate estimates of the multi- 
resolution RBF matrix with the diagonal preconditioner. 

• Spatially varying anisotropic kernels. An interesting observation is that the 
adapted discrete HB leads to a sparse multi-resolution RBF matrix represen- 
tation for spatially varying kernels. This type of RBF interpolation has been 
gaining some interest lately due to the ability to better steer each local RBF 
function to increase accuracy. Due to the spatially varying kernel, we can- 
not use a fast summation method to optimally compute each matrix vector 
product. However, preliminary results show that we can sparsify the RBF 
matrix while retaining high accuracy of the solution. Full error bounds and 
numerical results will be described in a following paper that we are currently 
writing. 

• Extension to Integral Equations: The approach described in this paper can 
also be applied to integral equations. The pre-conditioner would be built in 
the same fashion. 

After we submitted this paper we became aware of the H-Matrix approach by 
Hackbusch [11 applied to stochastic capacitance extraction [GT. We have observed 
that the authors in this paper have built sparse ©(TV log A^) representations of the 
inverse operator for an electrostatic kernel. Although many (non-oscillatory) integral 
equation problems we are aware of have asymptotically decaying kernels, in principal 
this approach could be applied to RBF interpolation problems with increasing kernels 
that satisfy Assumption [T] Indeed, the approach for Gaussian Regression in [10^ has 
many similarities with RBF interpolation. 

There are many similarities between the H-Matrix and Hierarchical Basis ap- 
proaches. The H-Matrix factorizes the operator matrix in a hierarchy of rectangu- 
lar boxes and approximate each block with a low rank matrix. HB builds a multi- 
resolution basis with polynomial vanishing moments and sparsifies the matrix operator 
by applying a change of basis. Both approaches lead to sparse matrix representations 
and can be used for pre-conditioning [40l |41] [12] . 

One observation is that the H-matrix approach would still have the coupling 
between the polynomial and RBF components, which could still lead to a poorly 
conditioned system. One way around this problem is to combine the H-Matrix and 
HB approaches by decoupling the system with HBs and then applying the H-matrix 
to compute a sparse form of Ky^^ . 
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